{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<style>\n",
    "pre {\n",
    " white-space: pre-wrap !important;\n",
    "}\n",
    ".table-striped > tbody > tr:nth-of-type(odd) {\n",
    "    background-color: #f9f9f9;\n",
    "}\n",
    ".table-striped > tbody > tr:nth-of-type(even) {\n",
    "    background-color: white;\n",
    "}\n",
    ".table-striped td, .table-striped th, .table-striped tr {\n",
    "    border: 1px solid black;\n",
    "    border-collapse: collapse;\n",
    "    margin: 1em 2em;\n",
    "}\n",
    ".rendered_html td, .rendered_html th {\n",
    "    text-align: left;\n",
    "    vertical-align: middle;\n",
    "    padding: 4px;\n",
    "}\n",
    "</style>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Frequently Asked Questions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### I have a massive CSV file which I can not fit all into memory at one time. How do I convert it to HDF5?\n",
    "\n",
    "Such an operation is a one-liner in Vaex:\n",
    "\n",
    "```\n",
    "df = vaex.from_csv('./my_data/my_big_file.csv', convert=True, chunk_size=5_000_000)\n",
    "```\n",
    "\n",
    "When the above line is executed, Vaex will read the CSV in chunks, and convert each chunk to a temporary HDF5 file on disk. All temporary will files are then concatenated into a single HDF5, and the temporary files deleted. The size of the individual chunks to be read can be specified via the `chunk_size` argument. \n",
    "\n",
    "For more information on importing and exporting data with Vaex, please refer to please refer to [the I/O example page](example_io.html)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Why can't I open a HDF5 file that was exported from a `pandas` DataFrame using `.to_hdf`?\n",
    "\n",
    "When one uses the `pandas` `.to_hdf` method, the output HDF5 file has a row based format. `Vaex` on the other hand expects column based HDF5 files. This allows for more efficient reading of data columns, which is much more commonly required for data science applications. \n",
    "\n",
    "One can easily export a `pandas` DataFrame to a `vaex` friendly HDF5 file:\n",
    "```\n",
    "vaex_df = vaex.from_pandas(pandas_df, copy_index=False)\n",
    "vaex_df.export_hdf5('my_data.hdf5')\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
